Evolutionary Patterns of the Chloroplast Genome in Vanilloid Orchids (Vanilloideae, Orchidaceae)

The Vanilloideae (vanilloids) is one of five subfamilies of Orchidaceae and is composed of fourteen genera and approximately 245 species. In this study, the six new chloroplast genomes (plastomes) of vanilloids (two Lecanorchis, two Pogonia, and two Vanilla species) were decoded, and then the evolutionary patterns of plastomes were compared to all available vanilloid plastomes. Pogonia japonica has the longest plastome, with 158,200 bp in genome size. In contrast, Lecanorchis japonica has the shortest plastome with 70,498 bp in genome size. The vanilloid plastomes have regular quadripartite structures, but the small single copy (SSC) region was drastically reduced. Two different tribes of Vanilloideae (Pogonieae and Vanilleae) showed different levels of SSC reductions. In addition, various gene losses were observed among the vanilloid plastomes. The photosynthetic vanilloids (Pogonia and Vanilla) showed signs of stage 1 degradation and had lost most of their ndh genes. The other three species (one Cyrotsia and two Lecanorchis), however, had stage 3 or stage 4 degradation and had lost almost all the genes in their plastomes, except for some housekeeping genes. The Vanilloideae were located between the Apostasioideae and Cypripedioideae in the maximum likelihood tree. A total of ten rearrangements were found among ten Vanilloideae plastomes when compared to the basal Apostasioideae plastomes. The four sub-regions of the single copy (SC) region shifted into an inverted repeat (IR) region, and the other four sub-regions of the IR region shifted into the SC regions. Both the synonymous (dS) and nonsynonymous (dN) substitution rates of IR in-cooperated SC sub-regions were decelerated, while the substitution rates of SC in-cooperated IR sub-regions were accelerated. A total of 20 protein-coding genes remained in mycoheterotrophic vanilloids. Almost all these protein genes show accelerated base substitution rates compared to the photosynthetic vanilloids. Two of the twenty genes in the mycoheterotrophic species faced strong “relaxed selection” pressure (p-value < 0.05).

Pogonia, a genus of Pogonieae, consists of five photosynthetic species [3], which are distributed in the temperate regions of East Asia and eastern North America [5]. P. japonica and P. minor are native to the sunny and wet areas of southern Korea [4]. Lecanorchis, a genus of Vanilleae, comprises 20 obligate mycoheterotrophic species [2,3], which are

General Features of Vanilloideae plastomes
The average coverage depths, voucher specimens, DNA numbers, and lengths of assembled plastomes are given in Table 1, and the NGS results are given in Supplementary Table S1. The coverage depth of newly sequenced species ranged from 76.9× (for Lecanorchis japonica) to 4394× (for Vanilla madagascariensis). The plastome size of L. japonica was the shortest, with 70,498 bp, and the size of Pogonia japonica was the longest, with 158,200 bp. Almost all plastomes have a quadripartite structure, which is composed of an large single copy (LSC), an SSC, and two IRs'. The size variations of the plastome regions of ten Vanilloideae and four Apostasioideae species are graphically presented in Figure 1. One noticeable size variation is the extensive reduction of the SSC region in the photosynthetic vanilloids. The SSC region ranged from 1254 bp in V. madagascarensis to 5885 bp in P. japonica. The annotated plastome maps of seven vanilloids belonging to the genera of Cyrtosia, Lecanorchis, Pogonia, and Vanilla are shown in Figure 2.    Almost all the genes from the nonphotosynthetic vanilloid species were found to be pseudogenized or lost. For example, all photosynthesis-related genes were lost from the plastomes of two mycoheterotrophic Lecanorchis species. Only 32 housekeeping genes (accD, clpP, inf A, matK, rpl2, rpl14, rpl16, rpl20, rpl36, rps2, rps3, rps4, rps7, rps8, rps11, rps12, rps14, rps18, rps19, 16S rRNA, 23S rRNA, 4.5S rRNA, 5S rRNA, trnC-GCA, trnD-GUC, trnF-GAA, trnfM-CAU, trnI-CAU, trnN-GUU, trnQ-UUG, ycf 1, and ycf 2) are present in L. japonica. In addition to the 32 genes, the trnE-UUC gene is present in the L. kiusiana plastome. The plastome of heterotrophic Cyrtosia showed intermediate gene contents between Vanilla and Lecanorchis. A few photosynthesis-related genes or pseudogenes still remain in the Cyrtosia plastome, while all photosynthesis-related genes have been lost in Lecanorchis. The detailed gene contents of vanilloids and the other related orchids are given in Supplementary Figure S1.

Comparative and Phylogenetic Analyses
In order to validate the structural changes in the vanilloid plastome, the SC-IR junctions were analyzed among 14 Vanilloideae and Apostasioideae species using an IRScope. In the photosynthetic species, the LSC-IRa and LSC-IRb junctions are located within the rpl22 gene and the intergenic spacer between trnH and psbA, respectively. In the nonphotosynthetic species, the LSC-IRa junction is located within rps19 (Cyrtosia) or rpl2 (Lecanorchis), and the LSC-IRb junctions are located near the matK gene (Supplementary Figure S2). In the typical orchid plastome, the SSC-IRa and SSC-IRb junctions are located within the ndhF gene and the ycf 1 gene, respectively. The SSC-IRa/b junctions of the four Apostasioideae species follow the same typical orchid form. Both the SSC-IRa and the SSC-IRb junctions in Lecanorchis are also located within ycf 1. However, these junctions occurred within ccsA (Vanilla) and ndhA (Pogonia) in photosynthetic vanilloids. In the Cyrtosia plastome, the SSC-IRa and the SSC-IRb junctions are located near rrn16 and ndhB, respectively.
The gene orders of 14 plastomes were analyzed using progressiveMauve. A total of ten gene rearrangement events were found ( Table 2). The longest inversion event occurred between trnS-GCU and ycf 3 in the LSC region, with a length of 58.591 kb. The shortest rearrangement occurred between ycf3 and trnS-GGA, and the inverted length was 3.282 kb. In addition to the 10 gene rearrangement events, several gene relocations occurred in mycoheterotrophic vanilloids. For example, the 16S rRNA, 23S rRNA, 4.5S rRNA, and 5S rRNA genes are usually positioned in the IR region. However, those four genes were in the SSC region in L. kiusiana and L. japonica. Table 2. Rearrangement blocks in the result of progressiveMauve among fourteen species (ten Vanilloideae species and four Apostasioideae species were used). The phylogenetic relationship of the vanilloid species was identified using a maximum likelihood (ML) tree and a Bayesian inference tree, which were constructed using 83 genes ( Figure 3 and Supplementary Figure S3). Both trees support the relationship of Apostasioideae (Vanilloideae(Cypripedioideae(Orchidoideae,Epidendroideae))). Almost all nodes were supported by a 100% bootstrap value and a 1.0 Bayesian posterior probability. In the subfamily Vanilloideae, two monophyletic groups, corresponding to the tribe Pogonieae and the tribe Vanilleae, were recognized. In the tribe Vanilleae, two clades corresponding to the photosynthetic group (Vanilla) and mycoheterotrophic group (Cyrtosia and Lecanorchis) were further identified. The Vanilla clade was further divided into two subclades corresponding to leafless species (V. aphylla and V. madagascariensis) and leafy species (V. pompona and V. planitifolia). The branch lengths leading to the mycoheterotrophic vanilloids are relatively longer than the branch lengths leading to the photosynthetic vanilloids in the ML tree.

Evolutionary Rate Comparison
The SSC contraction, mycoheterotrophic life cycle, and rearranged plastome regions are notable features in Vanilloideae plastomes because they are not found in other typical orchid plastomes. The regions that survived in a plastome after a mycoheterotrophic event of Cyrtosia and Lecanorchis are listed. The rearranged regions are also listed. These listed regions were tested to calculate the nonsynonymous/synonymous substitution rates, and the results are listed (Figures 4 and 5

Evolutionary Rate Comparison
The SSC contraction, mycoheterotrophic life cycle, and rearranged plastome regions are notable features in Vanilloideae plastomes because they are not found in other typical orchid plastomes. The regions that survived in a plastome after a mycoheterotrophic event of Cyrtosia and Lecanorchis are listed. The rearranged regions are also listed. These listed regions were tested to calculate the nonsynonymous/synonymous substitution rates, and the results are listed ( Figure 4, Figure 5, Supplementary Table S2). Comparison of synonymous (dS) and nonsynonymous (dN) substitution rates between species with rearranged and conserved plastomes in eight regions. A region with a significant pvalue (p-value < 0.05) was tagged by an asterisk (*) in its gene name. The red color in the bar graph indicates an acceleration of the substitution rates; the blue color indicates a deceleration of the substitution rates.
The genetic regions that were relocated from the typical Orchidaceae plastome to the Vanilloideae plastome are called rearranged regions in this study. There are eight rearranged regions: ccsA, rpl2, rpl22, rps7, rps12-3′, rps15, rps19, and ycf1. Among the eight regions, four (rpl2, rps7, rps12-3′, and rps19) were moved from the IR region to the SC region. The other four regions (ccsA, rpl22, rps15, and ycf1) were moved from the SC region to the IR region. In the case of nonsynonymous substitution rates (dN), the difference between the dN values of the rearranged group and the typical group does not seem related to the region shift. Meanwhile, the synonymous substitution rates (dS) value of the rearranged group differed from the typical group for regions that were shifted to IR. The IR-shifted group showed a much lower dS value than the typical group ( Figure 4). A log-likelihood test (LRT) was performed between the null model and the alternative model. The ycf1 region was the only statistically significant region among the eight shifted regions (p-value < 0.05).
There are twenty conserved regions in the Vanilloideae mycoheterotrophic plastome: clpP, infA, matK, rpl2, rpl14, rpl16, rpl20, rpl36, rps2, rps3, rps4, rps7, rps8, rps11, rps14, rps18, Figure 4. Comparison of synonymous (dS) and nonsynonymous (dN) substitution rates between species with rearranged and conserved plastomes in eight regions. A region with a significant p-value (p-value < 0.05) was tagged by an asterisk (*) in its gene name. The red color in the bar graph indicates an acceleration of the substitution rates; the blue color indicates a deceleration of the substitution rates.
The genetic regions that were relocated from the typical Orchidaceae plastome to the Vanilloideae plastome are called rearranged regions in this study. There are eight rearranged regions: ccsA, rpl2, rpl22, rps7, rps12-3 , rps15, rps19, and ycf 1. Among the eight regions, four (rpl2, rps7, rps12-3 , and rps19) were moved from the IR region to the SC region. The other four regions (ccsA, rpl22, rps15, and ycf 1) were moved from the SC region to the IR region. In the case of nonsynonymous substitution rates (dN), the difference between the dN values of the rearranged group and the typical group does not seem related to the region shift. Meanwhile, the synonymous substitution rates (dS) value of the rearranged group differed from the typical group for regions that were shifted to IR. The IR-shifted group showed a much lower dS value than the typical group ( Figure 4). A log-likelihood test (LRT) was performed between the null model and the alternative model. The ycf 1 region was the only statistically significant region among the eight shifted regions (p-value < 0.05). rps19, rps12-3′, rps12-5′, and ycf2 ( Figure 5). For all those regions, the dN and dS values were calculated using a PAML. As a result, the dN and dS values of nineteen regions, except for the dS value of clpP, were higher than regions from photosynthetic plastome. LRT was also performed between the null model and the alternative model so that five regions (clpP, rpl16, rps7, rps12-5′, and ycf2) were statistically significant regions among twenty conserved regions (p-value < 0.05). RELAX was performed using twenty-eight regions to check which evolutionary pressure was applied to the plastome. As a result, a total of six regions (clpP and ycf2 of the mycoheterotrophic conserved regions; rpl2, rps12-5′, rps15, and ycf2 of the rearranged region) were statistically significant (p-value < 0.05). Only the rpl2 region showed an intensification selection, while the other five regions showed a relaxation selection among statistically significant regions (Table 3).  There are twenty conserved regions in the Vanilloideae mycoheterotrophic plastome: clpP, inf A, matK, rpl2, rpl14, rpl16, rpl20, rpl36, rps2, rps3, rps4, rps7, rps8, rps11, rps14, rps18, rps19, rps12-3 , rps12-5 , and ycf 2 ( Figure 5). For all those regions, the dN and dS values were calculated using a PAML. As a result, the dN and dS values of nineteen regions, except for the dS value of clpP, were higher than regions from photosynthetic plastome. LRT was also performed between the null model and the alternative model so that five regions (clpP, rpl16, rps7, rps12-5 , and ycf 2) were statistically significant regions among twenty conserved regions (p-value < 0.05).
RELAX was performed using twenty-eight regions to check which evolutionary pressure was applied to the plastome. As a result, a total of six regions (clpP and ycf 2 of the mycoheterotrophic conserved regions; rpl2, rps12-5 , rps15, and ycf 2 of the rearranged region) were statistically significant (p-value < 0.05). Only the rpl2 region showed an intensification selection, while the other five regions showed a relaxation selection among statistically significant regions (Table 3).

Plastome Structure Evolution
The SSC contraction is an interesting feature of the Vanilloideae plastome (Figures 1  and 2). Though an SSC in a typical Orchidaceae is 10-18 kb in length, it ranges from 1.2-17 kb in length in the Vanilloideae plastome. However, the whole length of the Vanilloideae plastome is not much different from the typical Orchidaceae (Supplementary Table S3). The SSC contraction occurs mostly in the photosynthetic Vanilloideae. Vanilla madagascariensis, a photosynthetic member of Vanilleae, possesses the shortest SSC, which is 1.2 kb in length. The longest SSC of the photosynthetic Vanilloideae is 5.3 kb in Pogonia japonica. The SSC contraction event in photosynthetic Vanilloideae occurs in two tribes (Pogonieae and Vanilleae). Interestingly, the two tribes experience different levels of SSC contraction. In Pogonieae, Pogonia's SSC is 5.3 kb in length. However, the length of Vanilla's SSC ranges from 1.2-2.1 kb. Pogonia is treated as a basal taxon in the maximum likelihood and Bayesian phylogenetic trees. Because the SSC of Vanilla is shorter than Pogonia's, SSC contraction may be more extreme in the crown taxa of Vanilloideae.
The SSC contractions are common events in Vanilloideae. However, the SSC contraction in photosynthetic orchids is a rare evolutionary event from three other orchid subfamilies. It has only been reported in three species of Paphiopedilum (Cypripedioideae) and one species of Hetaeria (Orchidoideae) [16,20,29]. Paphiopedilum's SSC ranges from 2.4-5.1 kb, and Hetaeria's SSC is 2.3 kb. In photosynthetic orchids, SSC contraction is found in only three genera (Paphiopedilum, Pogonia, and Vanilla). Though very few cases are observed in photosynthetic orchids, most mycoheterotrophic orchids, such as Neottia (N. acuminata, N. camtschatea, N. listeroides, and N. nidus-avis), Epipogium (E. aphyllum, and E. roseum) have plastomes with shortened SSCs [11,15,26]. However, SSC contraction in mycoheterotrophs is an additional phenomenon that comes after gene reduction. Moreover, mycoheterotrophic Vanilloideae (Cyrtosia, Lecanorchis) has an SSC of 10-14 kb in length [25]. The SSC contraction event might be an independent event in Vanilloideae.
Another important event of the Vanilloideae plastome is gene reduction. In the case of Vanilloideae plastome, ndh genes are deleted or pseudogenized like in other Orchidaceae genera, such as Cymbidium, Goodyera, and Neofinetia [17,21,22,29]. Except for the ccsA gene of Vanilla madagascariensis (which is affected by SSC contraction), gene components of photosynthetic Vanilloideae (Pogonia and Vanilla) are similar to those of other photosynthetic orchids, such as Habenaria, Phalaenopsis [9,39].
Genome rearrangement is another notable feature of the Vanilloideae plastome. In order to document the structural rearrangement, the Vanilloideae and Apostasioideae plastomes were compared. A total of ten rearrangements were found ( Table 2) among species. The ten rearrangements have been plotted in the phylogenetic tree ( Figure 6). In the genus Neuwiedia of Apostasioideae, the plastome structure is identical to other typical orchid plastomes. No structural rearrangements or gene reduction occurred in this taxon. However, Apostasia, the other genus of Apostasioideae, shares two rearrangements with Pogonia and Cyrtosia of Vanilloideae: rearrangement A (ycf 3-trnS-GGA, 3.282 kb) and rearrangement B (trnS-GCU-ycf 3, 58.591 kb). One more independent rearrangement (rearrangement D-accD, 6.658 kb) is detected in Apostasia.

Evolutionary Rates and Selection Pressure
Eight relocated regions were affected by IR expansion/contraction (the ccsA, rpl22, rps15, and ycf1 regions moved to IR; the rpl2, rps7, rps12-3′, and rps19 regions moved to SC). The substitution rates of those eight regions were compared between the reference Pogonia and Vanilla share three rearrangements: G (rpl32-ndhF, 4.626 kb), I (ycf 1-ndhA, 6.469 kb), and J (ycf 1, 7.284 kb). Cyrtosia shares rearrangement I and J with Pogonia and Vanilla. Lecanorchis shares rearrangement J only with other Vanilloideae. Pogonia independently has rearrangement H (trnL-UAG-ndhA, 8.899 kb). This different rearrangement sharing may be caused by variations in SSC length. The Pogonia plastome has a longer SSC than Vanilla. In it, the ndhA region is generally located in the SSC region. Pogonia's SSC region is long enough to preserve the ndhA region, but Vanilla's SSC region is not. Thus, rearrangement H, which includes the ndhA region, occurs in the genus Pogonia only. In the case of mycoheterotrophic species, Cyrtosia has three independent rearrangements: rearrangements C (trnS-GCU-rbcL, 24.647 kb), E (ndhB-rps12, 4.626 kb), and F (trnV-GAC, 4.037 kb)). Lecanorchis does not show any other independent rearrangement due to its extremely degraded plastome. Thus, it seems likely that Lecanorchis had the common Vanilloideae rearrangements (such as G, H, I, and J) first and later experienced rearrangement losses due to the gene losses ( Figure 6). However, it is difficult to argue which events occurred first due to the limited plastome sequence availability.

Evolutionary Rates and Selection Pressure
Eight relocated regions were affected by IR expansion/contraction (the ccsA, rpl22, rps15, and ycf 1 regions moved to IR; the rpl2, rps7, rps12-3 , and rps19 regions moved to SC). The substitution rates of those eight regions were compared between the reference taxa (typical position of regions) and test groups (relocated). The nonsynonymous substitution rates (dN) are not different between the reference and test groups. However, the synonymous substitution rates (dS) are much lower in the test group than in the reference group in the case of the regions that were moved to the IR region. Similarly, the other four regions of the test group, which were moved to the SC region, show higher dS values than the reference group (Figure 4).
In the general angiosperm plastome, the IR strengthens the stability of the plastome structure. This structural stability is commonly supported by the copy-dependent repair mechanism (flip-flop recombination) stability [46,47]. For that reason, regions in IR show lower substitution rates or genetic diversity. Previous studies of other angiosperms, such as Sesamum (Lamiaceae) and Pelargonium (Polygonaceae), support the lower substitution rates of IR regions [48,49].
Mycoheterotrophic species usually have decreased plastome size and fewer plastome genes. Previous studies categorized mycoheterotrophic species by the level of plastome degradation [40,43]. According to those criteria, the Lecanorchis plastome can be categorized as stage 4 degradation, also known as the "stationary" stage [43]. In twenty retaining gene regions of Lecanorchis, the dN and dS substitution rates were compared between the test (mycoheterotrophic taxa; Cyrtosia and Lecanorchis) and reference groups (other photosynthetic relatives). Nineteen regions, except for clpP, have much higher substitution rates. There was a similar result in the genus Rhizanthella, which is an obligate mycoheterotroph of Ochidoideae [24]. In the case of the mycoheterotrophic Corallorhiza, however, it is reported that the w-ratio is not different from its photosynthetic relatives [50]. Although almost every region (except the only clpP region) shows elevated substitution rates, there are just five statistically significant regions in both the dS and dN comparisons (clpP, rpl16, rps7, rps12-5 , and ycf 2). Therefore, this is not enough evidence to claim that the mycoheterotrophic life cycle affects retention regions' elevated substitution rates.
The elevated substitution rates may be affected by selection pressure. In mycoheterotrophs, two regions (clpP and ycf 2) show significant p-values (p-value < 0.05), which were acquired by RELAX. Those two regions were under relaxation selection pressure (K < 1), which means that the surviving regions in mycoheterotrophic species' plastomes tend to be removed. Four regions under the position effect (rpl2, rps15, rps12-3 , and ycf 1) showed significant p-values. Two regions (rps15 and ycf 1), which were moved to the IR region from the SC region, experienced relaxation selection (K < 1), and the substitution rates were decreased in both. The other two regions (rpl2 and rps12-3 ) show accelerated substitution rates but different selection pressure: rpl2 experienced intensification selection pressure, and rps12-3 experienced relaxation selection pressure.

Phylogenetic Position and Time Estimation
In the maximum likelihood and Bayesian inference phylogenetic trees, Epipogium, Gastrodia, and Rhizanthella, which are mycoheterotrophs, have more elongated branch lengths compared to other photosynthetic taxa (Figure 3). Though the above three genera have extremely degraded plastomes, they have much fewer informative sites of the data matrix than other taxa. This leads to an elongated branch in the phylogenetic tree. However, mixed genera, such as Corallorhiza or Neottia, which contain both the photosynthetic and mycoheterotrophic species, have similar branch lengths to other typical clades, which are not significantly different. Because Vanilloideae also contains photosynthetic and mycoheterotrophic species, we also expected a similar branch length. However, the actual branch length of Vanilloideae seems to be elongated, especially in the genera with extremely degraded plastomes (Cyrtosia and Lecanorchis). Moreover, the branch length is longer in Vanilleae than in Pogonieae. This is presumed to be because the mycoheterotrophic genera are included in Vanilleae.
The whole chloroplast genome was used to estimate the derived time of Orchidaceae [20,30,51]. Vanilloideae was derived at 76.09 (57.6-94.0) mya (Supplementary Figure S4) in this study. Though this estimation is much more recent than some from other previous studies, it is within the error range [20,30]. Cyrtosia and Lecanorchis were derived at 32.75 (18.49-52.45)  Old clades tend to have severe plastome degradation since they have had enough time to accumulate mutations [43].

Conclusions
We reported six new chloroplast genomes (plastomes) of vanilloids (two Lecanorchis, two Pogonia, and two Vanilla species). Among these plastomes, Pogonia japonica has the longest plastome, with 158,200 bp in genome size. In contrast, Lecanorchis japonica has the shortest plastome, with 70,498 bp in genome size. The vanilloid plastomes have a typical quadripartite structure, but the small single copy (SSC) region was drastically reduced. The two different tribes of Vanilloideae (Pogonieae and Vanilleae) showed different levels of SSC reduction. In addition, various gene losses were observed among the vanilloid plastomes. The photosynthetic vanilloids (Pogonia and Vanilla) showed signs of stage 1 degradation and had lost most of their ndh genes. However, the other three species (one Cyrtosia and two Lecanorchis species) had stage 3 or stage 4 degradation and had lost almost all the genes in their plastomes, except some housekeeping genes. A total of ten genome rearrangements were found among ten Vanilloideae plastomes when compared to the basal Apostasioideae plastomes. Both synonymous and nonsynonymous substitution rates of the IR in-cooperated SC sub-regions were decelerated, while the substitution rates of the SC in-cooperated IR sub-regions were accelerated. A total of 20 protein-coding genes commonly remained in mycoheterotrophic vanilloids. Almost all these protein genes show accelerated base substitution rates compared to photosynthetic vanilloids. Our data clearly supports the different evolutionary modes of plastomes between photosynthetic and nonphotosynthetic lineages.

Plant Material and DNA Extraction
The plant leaf materials used in this study and their voucher information are given in Table 1. Five plant samples were collected in nature, and one plant sample (Pogonia japonica) was obtained from artificial cultivation. All voucher specimens and DNAs were deposited into the KUS herbarium and the Plant DNA Bank in Korea. Fresh leaf materials were ground into a powder with liquid nitrogen in a mortar. The ground samples were used to extract genomic DNA, using a G-spin™II for Plant Genomic DNA extraction kit (iNtRON, Seoul, Korea). The quality of genomic DNA was checked with a UV/VIS spectrophotometer (Thermo, Wilmington, DE, USA).

Sequencing, Assembly, and Annotation
Two samples-Vanilla madagascariensis and V. planifolia-were sequenced by the Illumina HiSeq 2000 (Illumina, San Diego, CA, USA). The other four samples-Lecanorchis kiusiana, L. japonica, Pogonia japonica, and P. minor-were sequenced by the Illumina MiSeq (Illumina, San Diego, CA, USA). The raw reads from the Illumina HiSeq were trimmed by Geneious 6.1.8 [52] with an option of 0.05 for the error probability limitation. Trimmed reads were assembled by reference guided assembly using a Geneious assembler; Vanilla planifolia (NC026778) was used as a reference sequence. The raw reads from the Illumina MiSeq were trimmed by BBDuk 37.64 and imported in Geneious 11.1.5 (Kmer length of 27). The trimmed reads were normalized by BBNorm 37.64 (with a target coverage level of 30 and a minimum depth of 12). Normalized reads were used to perform the de novo assembly by the Geneious assembler. Both result contigs, which came from HiSeq and MiSeq, were used as reference sequences to filter the trimmed reads. Filtered reads were re-used to conduct the de novo assembly using a Geneious assembler to construct a complete plastome sequence. Obtained plastome sequences were compared with a previous result contig and assembled by reference guided assembly. Two Lecanorchis reads were reassembled to check the plastome structure using SPAdes 3.10.0 with error correction and the assembling method.
Complete plastome sequences were annotated with BLASTn, tRNAscan-SE 2.0 [53], ORF finder, and the find annotation function in Geneious 11.1.5 (NC026778-Vanilla planifolia was used as a reference). The alternative start codons, such as ACG and TTG, were also included in the ORF finder. In order to judge a protein-coding gene, pseudogene, and its absence, we used the criteria from a previous study [20]. Translated ORFs were extracted to perform psi-blast [54] with orchid plastome CDS. Circular plastome maps were constructed using the OGdraw web server [55].

Phylogenetic Analysis
In addition to our 6 vanilloid plastomes, 83 plastome sequences were downloaded from the NCBI to perform a phylogenetic analysis (Supplementary Table S3, 74 sequences for Orchidaceae, four species for Asparagales, one species for Liliales, and four species for Vanilloideae). The 79 CDS and four rRNA genes were extracted from the 89 plastomes. Extracted gene regions were aligned using MAFFT [56]. Furthermore, 83 independent gene alignments were concatenated to be a single length of 87,230 bp. The concatenated alignment was used to perform jModeltest2 [57] in the CIPRES science gateway [58] to obtain the best model. The best-fit model for concatenated data was the GTR + G + I model. A maximum likelihood (ML) tree was constructed using RaxML-HPC2 on XSEDE in the CIPRES science gateway [59] and 100 bootstrap replicates of ML optimization likelihood (−563,601.727423). The extracted 83 regions were aligned by MUSCLE [60] and used to perform PartitionFinder2 [61] to estimate the best-fit model by region. Eighty-three alignments were concatenated by the best-fit model (4.5S rRNA, 5S rRNA, atpH, atpI, and psbN for the GTR all-equal model; clpP, petB, petG, petN, psbF, psbJ, psbL, and psbT for the HKY estimated model; and the other 70 regions for the GTR estimated model). Concatenated alignments (83,665 bp length of the GTR estimated model, 1410 bp of the GTR all-equal model, and 2155 bp of the HKY estimated model) were used to construct a Bayesian inference phylogenetic tree by using MRBayes_CIPRES api [62,63] with 1,000,000 of chain length.
IRScope [64] was used to describe the SC-IR junction region among the Vanilloideae and four Apostasioideae species. All those fourteen species' general features, such as the length of LSC/IR/SSC, were displayed on a graph by the ggplot2 package in R [65]. The progressiveMauve algorithm [66] was used to check gene relocation and plastome rearrangement among 14 species of Apostasioideae and Vanilloideae. The gene contents of 42 species in Orchidaceae were visualized as a heatmap by R.
Ten species of Vanilloideae and four species of Apostasioideae were used to estimate the synonymous and nonsynonymous substitution rates. Each region was aligned by MUSCLE [60]. The alignments were used to construct the maximum likelihood tree by the RaxML tree builder plugin in Geneious 11.1.5 with the GTR + G + I model. Alignments and phylogenetic trees were used to estimate the synonymous and nonsynonymous substitution rates by pamlX [67]. In order to estimate the position effect, regions were divided into two categories based on their movement from the original position to the IR region and from the original position to the SC region (LSC or SSC). A total of eight regions were counted as rearranged. Four regions-rpl2, rps7, rps12-3 , and rps19-moved to the SC region. Meanwhile, another four regions-ccsA, rpl22, rps15, and ycf 1-moved to the IR region. One ratio omega model along all species was used as a null model. This alternative model adopts a different omega value between moved taxa (to the SC or IR regions) and consistent taxa (runmode = 0, model = 1, codon frequency = 3). The likelihood ratio test was performed to validate the statistical difference between the null model and the alternative model (Supplementary Table S2).
In order to estimate the mycoheterotrophic effect, the remaining 20 regions in all 14 species were used. Each sequence was aligned by MUSCLE, and the alignments were used to construct a maximum likelihood tree using the RaxML tree builder plugin with the GTR + G + I model. Alignments and phylogenetic trees were used to estimate the synonymous and nonsynonymous substitution rates by pamlX. One ratio omega model along all species was used as a null model. The alternative model adopts different omega values between mycoheterotrophic and photosynthetic taxa (runmode = 0, model = 1, codon frequency = 3). The likelihood ratio test was performed to validate the statistical difference between the null model and the alternative model (Supplementary Table S2).
A total of 28 regions (20 regions from the mycoheterotrophic constant region and eight regions for rearrangement) were used to perform RELAX in the Datamonkey server [68][69][70][71]. The selection intensity parameter and synonymous/nonsynonymous substitution rates between the reference branch (photosynthetic plants or constant regions) and test branch (mycoheterotrophic plants or rearranged regions) were checked.

Time Estimation
The alignments used in the Bayesian inference tree construction were used to estimate the divergence time estimation. Eighty-three regions were used (4.5S rRNA, 5S rRNA, atpH, atpI, and psbN for the GTR all-equal model; clpP, petB, petG, petN, psbF, psbJ, psbL, and psbT for the HKY estimated model; and the other 70 regions for the GTR estimated model). The XML file was prepared by BEAUti 2.5.2 with three fossil data to calibrate the nodes (Asparagales, normal distribution, mean 105.3, sigma 8.0; Dendrobium, log-normal distribution, sigma 2.0, offset 23.2; Goodyera, log-normal distribution, sigma 2.0, offset 15.0) [51,72,73]. A relaxed clock log-normal model [74] and a yule model were chosen to perform Markov chain Monte Carlo (MCMC) 100,000,000 times. Two independent analyses were performed by BEAST2-XSEDE in the CIPRES science gateway [75]. Trees and logs were collected for every 5000 generations.
The total log file was tested to check the effective sample size (ESS) by Tracer v1.6 [76], and the major parameters (posterior, prior, likelihood) were over 100 of the ESS. Tree files were concatenated by LogCombiner v2.5.2 [77] with an option of a 15% burn-in. A concatenated tree file was treated by TreeAnnotator [78] in the CIPRES science gateway with an option of 0.95 posterior probability. The concatenated maximum clade credibility tree was annotated by FigTree v1.4 [79] and two R packages of 'phytools' and 'ape' [80,81]. comments to improving the manuscript. We also would like to thank to Kyeongwon Kang of Babo orchid Farm for the materials of Pogonia japonica. We thank the curator of the Korea University Herbarium (KUS) for voucher specimen preparation. The genomic DNAs are deposited in the Plant DNA Bank in Korea (PDBK).

Conflicts of Interest:
The authors declare no conflict of interest.